Vendor tomli to break the circular dependency. #483
(I'm repeating myself from other issues, but I want to record this here rather than assuming people will wade through those long discussions) I get that vendoring would make some use cases easier. But it makes other use cases more difficult, because I know some downstreams (Linux distros in particular) will unvendor it and then have to solve the same dependency cycle that the vendoring avoids. So I'd rather look for a solution that doesn't involve vendoring. I believe I have the outline of one that only requires a minor tweak for redistributors: build flit_core (which only requires Python itself), then use tomli from source with flit_core to build tomli. You see this as some sort of eldritch abomination, and I imagine if you're used to compiled languages, using a library before building it is a weird idea. But from my perspective, this is Python imports working precisely as designed, and a perfectly reasonable approach. If bundling packages together is acceptable to you, look at https://github.com/FFY00/python-bootstrap - whether or not that implementation is acceptable to you, look at the idea. It's essentially the same idea, just taken a step further: bundle all the fundamental packaging tools together and use them to build & install one another. That's a simple way out of any dependency cycles. |
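The bootstrap order described in this comment (get flit_core working with nothing but Python, then build tomli with flit_core while tomli is imported straight from its source tree) might look roughly like the sketch below. It is an illustration under assumptions, not a recipe from the Flit maintainers: `bootstrap_env` and `build_tomli_wheel` are hypothetical helpers, the directories passed in are assumed to be the ones that directly contain the importable `flit_core` and `tomli` packages, and the only real API relied on is the PEP 517 hook `flit_core.buildapi.build_wheel`.

```python
import os
import subprocess
import sys

# Executed in a child interpreter whose working directory is the tomli
# source tree. flit_core.buildapi is flit_core's PEP 517 backend module.
BUILD_SNIPPET = (
    "import os; os.makedirs('dist', exist_ok=True); "
    "from flit_core import buildapi; "
    "print(buildapi.build_wheel('dist'))"
)


def bootstrap_env(source_dirs, base_env=None):
    """Return a copy of the environment with source_dirs prepended to PYTHONPATH.

    Hypothetical helper: nothing is installed, the packages are simply made
    importable from their unpacked source directories.
    """
    env = dict(os.environ if base_env is None else base_env)
    parts = [os.fspath(d) for d in source_dirs]
    if env.get("PYTHONPATH"):
        parts.append(env["PYTHONPATH"])
    env["PYTHONPATH"] = os.pathsep.join(parts)
    return env


def build_tomli_wheel(flit_core_src, tomli_src):
    """Build a tomli wheel using uninstalled flit_core and tomli sources."""
    env = bootstrap_env([flit_core_src, tomli_src])
    subprocess.run(
        [sys.executable, "-c", BUILD_SNIPPET],
        cwd=tomli_src,
        env=env,
        check=True,
    )
```

Because both packages are pure Python, nothing needs to be compiled before it can be imported, which is what makes this ordering possible at all.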
If they already do that with pip it must not be that difficult for them to also do that here since they clearly have the tooling for it IMO.
It's more an issue with us not having the ability to easily handle circular toolchain dependencies (basically our entire toolchain is built from source in stages with very few base build host dependencies). If I define a circular dependency between the
I mean using a project's bundled library to build itself is one thing, having an external circular dependency is something totally different IMO. If a package depends on itself to build we can handle that as it doesn't create a circular dependency.
I'm just not seeing a way to do something like this in a way that can be up-streamed. We have build/install stage isolation features that make that approach very difficult. |
This circular dependency (along with a number of other issues I've raised) is also an issue for other distros like Gentoo:
I think it's really being underestimated how big a pain point this is for distros, there's IMO absolutely no reason build tools like |
But if those downstream integrators would add support for circular dependencies they'd solve this problem forever 🤔 for all packages. So IMHO could be worth tackling that challenge. |
From my understanding supporting circular dependencies on our side isn't possible, buildroot uses |
I'm aware that the circular dependency is going to cause some difficulty for any downstream packagers. What I'd like to know better is:
I'd like to hear this also from downstreams other than buildroot. 😉 |
I am quite happy with Fedora's current solution in tomli, but I am not sure if it scales well in case flit_core gains more and more dependencies that use flit_core to build. In that case, we might consider treating tomli as a non-standard case instead of treating the dependencies as such. |
But that is not the case, is it? flit_core actually requires tomli. |
It requires tomli only to run. But to build flit_core only Python is required. |
Even if that is the case (and I am afraid we failed to do that in Fedora, but I can check), we would only manage to build a flit_core that isn't installable until we also build tomli. And we cannot build tomli because flit_core isn't installable. |
We could build a special variant of flit_core that does not declare the dependency, build tomli with it (bending it so that flit_core imports tomli from source), and then build flit_core regularly. That, however, requires treating both flit_core and tomli as special cases. |
For us it seems to be impossible to resolve the circular dependency issue in a reasonably maintainable way downstream.
Yeah, patching in a bunch of
For us the key thing is any solution must break the circular dependency graph.
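To make the cycle concrete, here is a deliberately simplified model of the two packages discussed in this thread, using the standard-library `graphlib` module (Python 3.9+). It assumes a build system that cannot separate build-time from run-time edges, so a package only counts as usable once both kinds of dependency are satisfied. The dictionaries and the `build_order` helper are illustrative only and do not come from buildroot, Fedora or Flit.

```python
from graphlib import CycleError, TopologicalSorter

# Edges as described in the thread: tomli is built with flit_core,
# and flit_core needs tomli when it runs.
BUILD_REQUIRES = {"flit_core": set(), "tomli": {"flit_core"}}
RUN_REQUIRES = {"flit_core": {"tomli"}, "tomli": set()}

# If flit_core carried its own private copy of tomli, the run-time edge
# to the external tomli package would disappear.
RUN_REQUIRES_VENDORED = {"flit_core": set(), "tomli": set()}


def build_order(build_requires, run_requires):
    """Return a valid build order, or None if the combined graph has a cycle."""
    packages = set(build_requires) | set(run_requires)
    graph = {
        pkg: set(build_requires.get(pkg, ())) | set(run_requires.get(pkg, ()))
        for pkg in packages
    }
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError:
        return None


print(build_order(BUILD_REQUIRES, RUN_REQUIRES))           # None: cycle
print(build_order(BUILD_REQUIRES, RUN_REQUIRES_VENDORED))  # ['flit_core', 'tomli']
```

In this model the only ways to get an order back are to drop an edge (vendoring) or to stop merging the two edge types (treating one of the packages as a special case during bootstrap), which matches the two positions argued in the discussion.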
It's much easier for a downstream like debian to devendor a package designed to work either in a vendored or devendored configuration than for distros that can't handle circular dependencies to vendor something. I realize it's not ideal to vendor dependencies but I can't think of another way to handle this issue reliably. If there isn't a solution for devendoring pip why would there be one for other build tools? I don't see how this problem is any different from the one
Yeah, this will also be an issue for
For us this build vs run dependency distinction doesn't make much of a difference, if
Yeah, for us the only way to do something like this is to maintain multiple versions of a package, but doing so would significantly complicate our toolchain bootstrap process. This would quickly become an unmaintainable mess. |
It's very much the plan that it won't gain more and more dependencies. The reason for making flit_core a separate package was for it to have minimal dependencies. Parsing TOML is fairly unavoidable, but other than that I don't foresee a need for any extra dependencies. If tomli was vendored in flit_core, would you (Fedora, and any other downstreams - I know the view from buildroot):
As @jameshilliard mentions, it looks like we might be heading towards having a few other key packaging pieces built with Flit (installer, pep517, packaging, build). This isn't necessarily a cycle - I understand from your PackagingCon talk that Fedora has its own tooling to work with PEP 517 interfaces, so I imagine |
I have also wondered about publishing a |
How would one avoid this creating a cycle here other than vendoring? I mean the current way those packages avoid creating a cycle is to have setuptools/distutils fallbacks. If |
That was my understanding as well and the only reason I decided to build tomli the way we do instead of trying to solve this in flit_core.
Probably one of the latter 2 options. De-vendoring it afterward sounds reasonable, as long as our packaged tomli isn't much different from the vendored one (different version with different API, custom patches in vendored lib, etc.).
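One common way for a package to work in both a vendored and a de-vendored configuration is to try its private copy first and fall back to the system-wide module if a distro has removed it. The sketch below shows that pattern in generic form. `first_importable` is a hypothetical helper and the candidate list is only an example based on the `flit_core._vendor` path used in this PR's diff; it is not how flit_core itself imports tomli.

```python
import importlib


def first_importable(candidates):
    """Import and return the first module in candidates that is available."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Covers ModuleNotFoundError too, e.g. when a distro has
            # deleted the private _vendor directory.
            continue
    raise ImportError(f"none of {list(candidates)!r} could be imported")


# Example preference order: private copy first, then the distro's package.
TOML_CANDIDATES = ("flit_core._vendor.tomli", "tomli")
```

A downstream that de-vendors would then only need to delete the private directory and declare a normal dependency, provided its packaged tomli offers the same API as the vendored one.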
Well, our custom tooling still shells out to |
To clarify we would probably still carry |
Yeah, we haven't supported |
Also, note that Fedora's tooling uses |
I wholeheartedly agree with vendoring tomli. Setuptools also vendors its dependencies, thanks to maintainers of package managers like ourselves: pypa/setuptools#980
Yes! For those who don't know, Spack is another from-source package manager designed for supercomputers. My solution in Spack was actually the reverse of what is done here. Since flit_core needs tomli at run-time, but tomli only needs flit_core at build-time, I download a copy of flit_core and use it to build tomli. Then flit_core simply has a normal dependency on tomli. This is only possible because none of these packages are compiled, and I can simply add the source code to my
I think there are different levels of guiltiness when it comes to vendoring. Some packages like PyTorch/Tensorflow vendor their dependencies even when they don't need to, and rely on very specific commits (not stable releases) of their dependencies. This is a nightmare for package managers. In the case where vendoring happens to allow a package to build from source without cyclic deps, this is okay in my mind. Spack does the same thing, vendoring its own dependencies so that users don't need to install anything to get a working package manager. I think the important thing is that:
|
Thanks all. It sounds like there's maybe less objection to vendoring than I had thought, though I'd still like to hear from other downstreams - if you know of any people who might be working on this in other environments, do ping them. It's only tomli that I'd consider vendoring in What else can we do to make it feasible to bootstrap these packaging tools? Ideas I've had so far:
|
Agree, if anything it would make sense to vendor |
Maybe it would make sense to have a vendored version of |
|
That is fundamentally incompatible with the model that Python's packaging story has adopted with https://www.python.org/dev/peps/pep-0517/ -- packages should be able to use any sort of custom build-backend from any Python package they want, as long as it is a build-time dependency declared in the |
Similarly to how there isn't a single "blessed" build backend, I don't think there is a single "blessed" build frontend either. I don't think build/installer are very popular compared to pip. I'm currently overhauling the way Spack builds/installs Python packages, so I was looking at using build/installer to do this, but both of them have a ton of dependencies making bootstrapping an absolute nightmare. I ended up using pip instead. This isn't directly relevant to the conversation at hand, so I don't want to sidetrack this PR, but if anyone has any strong thoughts about build/installer vs. pip, please comment on spack/spack#27798. |
FWIW, vendoring and installing from wheels have the same issue: there's a second "source" that needs to be independently audited and maintained. (Assuming wheels aren't used as the "main" source, which is often problematic as they omit tests or documentation.)
This looks similar to the Fedora approach of unpacking directly to site-packages, except it's not Fedora-specific. Nice! And it also skips dependency checking, so it removes the need for a |
I've already switched Gentoo to copy the files from the wheel directly. Right now the thing that sucks the most is that we have to download two separate archives with some common files: wheel to get the dist-info, and github archive to get tests. |
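Copying files from a wheel works because a wheel is an ordinary zip archive whose top-level entries are laid out as they should appear in site-packages. A minimal sketch of that idea follows; `unpack_wheel` is a hypothetical helper, and it deliberately ignores everything a real installer does beyond extraction (`*.data` directories, entry-point scripts, RECORD verification, bytecode compilation and dependency checks).

```python
import zipfile
from pathlib import Path


def unpack_wheel(wheel_path, target_dir):
    """Extract every file in the wheel into target_dir and return their names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(wheel_path) as wheel:
        names = wheel.namelist()
        wheel.extractall(target)
    return names
```

For a pure-Python package with no scripts, such as tomli, the result is the package directory plus its `.dist-info` directory, which is the metadata Gentoo needs from the wheel.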
I'm not involved in I feel like I'm repeating myself, but if you want a bundle of all of these projects together to simplify bootstrapping, you can make that yourself. It doesn't have to be an official release of one of these projects. The python-bootstrap repo shows how you can use the projects together to get them all installed.
(Responded on the Spack issue - there is a reason they exist, but if it's easier for you to use pip, that's fine)
Thanks - it sounds like that would be useful for at least some use cases. To be clear, this would work by building a wheel and unpacking it again. Let's take any further discussion of that idea to #481. 🙂
Are you talking about |
I guess it would probably make more sense to vendor
Additional backends are only a problem if they have circular dependencies, I think the important part is that the build frontends like
I think if
Vendoring should be done without namespace pollution, meaning the vendored dependency should only be directly used by the package that vendored it. This should be present in the main repo/sdist along the lines of
I think you get better deduplication of build+install functionality by vendoring everything needed to build and install
Yeah, we really don't want to have to do something like this.
So the problem with that approach and why upstream vendoring is a lot better is that you would get namespace pollution if you just combine the installs naively like in |
Sorry, I didn't look to which bugtracker I am replying ;-). I was talking of tomli for now, as setuptools started depending on it. Flit I'm still building via pp2sp and setuptools but I'm getting really tired of having to repeat all the awful hacks (like implicit |
Isn't setuptools adding a properly vendored version of |
By the way I just noticed that PEP-517 explicitly prohibits any sort of dependency cycles in the build requirements section:
Based on my reading of this PEP-517 provision any packages with |
OK, accepted the principle of vendoring. To implementation.
My thoughts are basically: let's keep it as simple as possible.
@@ -6,7 +6,7 @@
 from unittest.mock import patch
 import pytest

-import tomli
+from flit_core._vendor import tomli
Let's revert this and ensure tomli is installed for tests in Flit (not flit_core).
nox.options.reuse_existing_virtualenvs = True

@nox.session
def vendoring(session: nox.Session) -> None:
I'm not keen on involving an extra automation tool (nox) and extra bits of CI logic just for this.
TBH, I think Pradyun's vendoring tool is overkill here. I can absolutely see why you'd want it for pip, where there are 25 vendored packages already, but we're only vendoring one, there should be no need to vendor more, and I'd like to make it clear that it's just one. Hopefully even that one will be unnecessary in a few years (if a TOML parser is added to the standard library).
So I think it's easier to update it by hand from time to time than to manage extra code for updating it.
"""A lil' TOML parser."""

__all__ = ("loads", "load", "TOMLDecodeError")
__version__ = "1.2.2"  # DO NOT EDIT THIS LINE MANUALLY. LET bump2version UTILITY DO IT
Version 1.2.3 has been released now. It fixes one bug. We may want to update Tomli to that.
I've opened #492 to do this but without the extra bits of tooling and automation - it's just one package, so I want to keep it simple. |
Minimal vendoring implementation based on pip vendoring.
This seems to be straightforward maintenance-wise and should make things a lot easier when it comes to bootstrapping flit_core.